Instance Weighting for Neural Machine Translation Domain Adaptation

Authors

  • Rui Wang
  • Masao Utiyama
  • Lemao Liu
  • Kehai Chen
  • Eiichiro Sumita
Abstract

Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, applying it directly to Neural Machine Translation (NMT) is challenging, because NMT is not a linear model. In this paper, two instance weighting techniques, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German/French tasks show that the proposed methods can substantially improve NMT performance by up to 2.7-6.7 BLEU points, outperforming the existing baselines by up to 1.6-3.6 BLEU points.
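
The abstract does not spell out the training objective, so the following is only a minimal sketch of the general idea behind instance weighting in a likelihood-trained model: each sentence pair's negative log-likelihood is scaled by a weight before averaging. The function names (weighted_nll, domain_weights), the normalization by the weight sum, and the toy numbers are assumptions made for illustration, not the authors' formulation; the dynamic weight learning strategy is not modeled here.

```python
import math


def weighted_nll(log_probs, weights):
    """Weighted average negative log-likelihood over sentence pairs.

    log_probs[i] is the model's log p(y_i | x_i) for sentence pair i.
    weights[i] is the instance weight of that pair (assumed non-negative).
    Normalizing by the weight sum is an illustrative choice.
    """
    if len(log_probs) != len(weights):
        raise ValueError("log_probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / total


def domain_weights(is_in_domain, in_weight, out_weight=1.0):
    """Domain-level weighting: one shared weight per domain.

    Sentence-level weighting would instead supply a separate weight
    for every sentence pair, e.g. from a similarity score.
    """
    return [in_weight if flag else out_weight for flag in is_in_domain]


# Toy batch: the first pair is in-domain, the second is out-of-domain.
log_probs = [math.log(0.5), math.log(0.25)]
weights = domain_weights([True, False], in_weight=3.0)

uniform_loss = weighted_nll(log_probs, [1.0, 1.0])
weighted_loss = weighted_nll(log_probs, weights)
```

In this toy batch the in-domain pair contributes three times as much to the loss as the out-of-domain pair, so gradient updates computed from the weighted loss would be pulled toward fitting the in-domain data.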

Similar resources

Cost Weighting for Neural Machine Translation Domain Adaptation

In this paper, we propose a new domain adaptation technique for neural machine translation called cost weighting, which is appropriate for adaptation scenarios in which a small in-domain data set and a large general-domain data set are available. Cost weighting incorporates a domain classifier into the neural machine translation training algorithm, using features derived from the encoder repres...

Resampling Approach for Instance-based Domain Adaptation from Patent Domain to Newspaper Domain in Statistical Machine Translation

In this paper, we investigate a resampling approach for domain adaptation from a resource-rich domain (patent domain) to a resource-scarce target domain (newspaper domain) in Statistical Machine Translation (SMT). We propose two resampling methods for domain adaptation in SMT: random resampling and resampling for instance weighting. The random resampling randomly adds sentence pairs from the re...

Discriminative Instance Weighting for Domain Adaptation in Statistical Machine Translation

We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus component...

Sentence-Level Instance-Weighting for Graph-Based and Transition-Based Dependency Parsing

Instance-weighting has been shown to be effective in statistical machine translation (Foster et al., 2010), as well as cross-language adaptation of dependency parsers (Søgaard, 2011). This paper presents new methods to do instance-weighting in state-of-the-art dependency parsers. The methods are evaluated on Danish and English data with consistent improvements over unadapted baselines.

Translation Model Based Weighting for Phrase Extraction

Domain adaptation for statistical machine translation is the task of altering general models to improve performance on the test domain. In this work, we suggest several novel weighting schemes based on translation models for adapted phrase extraction. To calculate the weights, we first phrase align the general bilingual training data, then, using domain specific translation models, the aligned ...

Publication date: 2017